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ABSTRACT 

A/'-body simulations are essential for understanding the formation and evolution of structure 
in the Universe. However, the discrete nature of these simulations affects their accuracy 
when modelling collisionless systems. We introduce a new approach to simulate the 
gravitational evolution of cold collisionless fluids by solving the Vlasov-Poisson equations 
in terms of adaptively reflneable “Lagrangian phase space elements”. These geometrical 
elements are piecewise smooth maps between Lagrangian space and Eulerian phase space 
and approximate the continuum structure of the distribution function. They allow for 
dynamical adaptive splitting to accurately follow the evolution even in regions of very 
strong mixing. 

We discuss in detail various one-, two- and three-dimensional test problems to demonstrate 
the performance of our method. Its advantages compared to Wbody algorithms are: i) 
explicit tracking of the fine-grained distribution function, ii) natural representation of 
caustics, iii) intrinsically smooth gravitational potential fields, thus iv) eliminating the 
need for any type of ad-hoc force softening. 

We show the potential of our method by simulating structure formation in a warm dark 
matter scenario. We discuss how spurious collisionality and large-scale discreteness noise of 
Wbody methods are both strongly suppressed, which eliminates the artificial fragmentation 
of filaments. 

Therefore, we argue that our new approach improves on the Wbody method when 
simulating self-gravitating cold and collisionless fluids, and is the first method that allows 
to explicitly follow the fine-grained evolution in six-dimensional phase space. 

Key words: cosmology: dark matter - cosmology: large-scale structure of the Universe - 
cosmology: theory - galaxies: kinematics and dynamics - methods: numerical 


1 INTRODUCTION 

Numerical simulations lie at the very heart of contemporary 
cosmology. They are the only method that can accurately follow 
the growth of small primordial density fluctuations into the 
highly nonlinear objects that populate the low-redshift Universe 


(e.g. Davis et al.|1985 

Efstathiou et al.|1985 Bertschinger|1998 

jSpringel et al.||2005 

Angulo et al.||2012 

). As such, they have 


proven an indispensable tool in the formulation of our theory 
of cosmological structure formation and in the validation of 
the ACDM model. 


Since most of the mass in the Universe appears to be in 
the form of dark matter (DM; a fundamental particle with 
a negligible non-gravitational interaction cross-section with 
both itself and baryonic matter), numerical simulations that 
only follow gravitational forces were the natural first tool em¬ 
ployed by pioneer numerical cosmologists. Since the 1970s, 
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these simulations have progressively increased their scope and 
accuracy, nowadays spanning a huge dynamic range. State- 
of-the-art simulations employ trillions of bodies to describe 
volumes comparable to the observable Universe, while resolv¬ 
ing the collapsed DM structures that could host the faintest 
galaxies (see e.g. Heitmann et al.||2015| [Skillman et al.||2014 


Ishiyama et al.||2015 for recent examples) 

A milestone in the history of gravity-only simulations was 
the establishment of a universal form for the density profile 
of collapsed dark matter haloes ( Navarro et al.p996 1997). 
Other important results were the accurate characterisation of 
the hierarchy of nonlinear correlation functions, of the cosmic 
velocity field, precise estimates of nonlinear effects on the BAO 
peak and the universal form for the abundance of dark matter 
haloes, which secure their use for cosmological parameter esti¬ 
mation (see the reviews of Kuhlen et al.||2012 Frenk & White 
2012 and references therein). 

The central role of simulations in cosmology is in part due 
to the apparent robustness of the methods employed. Their re¬ 
sults seem to hold even when extremely different time, mass and 
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2 0. Hahn & R. E. Angulo 


force resolution are used. For instance, regarding the functional 
form and universality of halo density profiles, state-of-the-art 
simulations show a remarkable agreement with those carried 
out 20 years ago, despite employing 5 orders of magnitude more 


particles, and 100 times better force resolution (e.g. Diemand 
et al.|2008 Springel et al.||2008 Gao et ar]|2012 ). It is worth 


noting that, despite some recent hints (Pontzen & Governato 


2013 Ludlow et al. 2013 2014), the fundamental origin of 


the halo density profile is still unknown, and even a purely 
numerical one has been suggested (e.g. Baushev|2015 ). 


1.1 The Collisionless Boltzmann equation 

Dark matter (DM) in the Universe can be considered as a 
collisionless fluid. Even in the case of a (heavy) 100 GeV DM 
particle, the mean comoving number density of particles is 
~ 10^®pc“^. This obviously implies that dark matter, just 
like baryons, can be treated as a fluid on cosmological and 
astrophysical scales, where the microphysical particle character 
is irrelevant. Additionally, DM appears to have an extremely 
weak interaction cross-section with itself or with ordinary mat¬ 
ter. Therefore, the DM fluid can be regarded as collisionless. 
These properties hold for most particle dark matter candidates, 
even for warm ones such as e.g. sterile neutrinos. 

A self-gravitating collisionless fluid is fully described by the 
evolution of its six-dimensional phase-space density /(x, p,t), 
which is governed by the collisionless Boltzmann equation 
(GBE): 


0 = 


d/(x,p,^) 

dt 


^ P 

dt ma^ 


‘ ^xf Tfl^x4^ ' ^pf •) 


( 1 ) 


supplemented with Poisson’s equation for the Newtonian grav¬ 
itational potential 0 


AirGrrip 

a 



( 2 ) 


where rrip is the (microphysical) particle mass, fi is the mean 
number density of dark matter particles in the Universe, G is 
the gravitational constant and a is the cosmological scale factor 
that itself obeys the first Eriedmann equation. The system of 
the GBE with a self-coupling described by Poisson’s equation 
is commonly termed the Vlasov-Poisson system of equations. 
While we focus here on modelling the DM dynamics, we note 
that Vlasov-Poisson describes all collisionless fluids with a 1/r 
potential and thus the discussion presented in this paper also 
applies to those cases. 

One can consider two limits for the initial distribution 
function, /(x,p,t = 0). In the hot limit, the problem is mani¬ 
festly six-dimensional, where the thermal width gives a natural 
scale in velocity space that one can aim to resolve. Quite 
in contrast, in the cold limit, i.e. when the thermal particle 
velocity dispersion is much smaller than the bulk velocities 
arising from gravitational instability (which is an excellent 
approximation for both cold and warm DM candidate^, the 


^ In fact, for a 1 keV particle, the RMS velocity is only around 
~ 0.4km/s at 2 ; = 9 ( [Bode et al.|200l] ) when structure formation 
starts to become nonlinear. This velocity further decays and is orders 
of magnitude smaller than the velocities arising from collapse. The 
signature of the thermal velocity is the truncation of the spectrum, 
which represents the largest Jeans mass before the particles start to 
cool. At later times, the particles are never substantially heated again 


problem is only three-dimensional: the thermal width is zero 
and thus the fluid occupies only a three-dimensional submani¬ 
fold of phase space (cf. e.g. 

This means that the distribution function is a smooth (i.e. 
differentiable) three-dimensional hypersurface without holes 
in six-dimensional phase-space. The Vlasov-Poisson equation 
guarantees that it remains smooth and that no holes will 
emerge in phase space during its evolution. In particular, one 
can parameterise this submanifold Q in terms of Lagrangian 
coordinates q (so that it is customary to call it the “Lagrangian 
manifold”) and express the mapping between Lagrangian and 
Eulerian phase space as 


Zeldovich|1970 Arnold et al.|1982). 


sc: 


q 1 -^ (> 


( 3 ) 


whose evolution describes the full evolution of the fluid in 
both the monokinetic (single-stream) and the multi-stream 
regime. Despite the cold fine-grained distribution function, 
this three dimensional hyper-surface can become “dense” in 
phase space through gravitational evolution and mixing, which 
means that it will cover more and more of phase space over 
time and approach a state close to a macroscopically “hot” 
system. Once the phase space is densely covered (meaning that 
the distance in phase space between different regions of the 
sheet becomes small compared to the extent of the system), 
one can expect that the system does not change any more on 
time scales on which particles move, which means that it will 
approach ergodicity in those regions. 

Unfortunately, the solution of the set of equations in the 
hot limit without further approximations is extremely expen¬ 
sive computationally, and an explicit 6D solution has been 


obtained on ly for a handful of simplified cases (e.g. Yoshikawa 
et al.||2013 using a 64P grid). The situation is in some sense 


even worse in the cold limit, since both spatial and velocity 
resolution achievable with such a coarse-grained approach are 
completely insufficient to resolve the dynamics of a cold system 
without being dominated by diffusion in phase space. Addi¬ 
tionally, because of the collisionless nature of the DM fluid, 
Boltzmann’s H-theorem does not apply (in the sense that it 
is not to be expected that the system approaches a maximum 
entropy state described by a single “temperature” on short- 
enough time scales) and hence the GBE cannot be replaced 
by a low-order moment expansion in terms of fluid equations 
with a simple closure condition such as an effective pressure 
or effective viscosity (although such approaches are applied on 
mildly non-linear scales, e.g. [Garrasco et al.||2Q12| [Hertzber^ 
|2014[ but the effective fluid properties have not been deter¬ 
mined from first principles, and are provided by virialisation 
on smaller scales rather than a conventional thermalisation). 

To make progress, it has thus been customary to postpone 
the goal of following the exact evolution of the fine-grained 
distribution function and instead to solve the Vlasov-Poisson 
system using a Monte-Garlo approach. The idea is to sample 
the distribution function at N discrete locations q^ (z = 1... V) 
with massive bodies (in what can be thought of as a “coarse- 
graining” of the initial conditions over macroscopic volume 


since they remain locally (on their stream) roughly at the cosmic 
mean density even inside collapsed structures (cf. [Vogelsberger &:| 
|White|2011| >. In very good approximation, it thus suffices to consider 
all (non-relativistic) dark matter particles as a cold fluid. 


© 0000 RAS, MNRAS 000 , 000-000 









































Refined phase-space elements 3 


elements 3V containing a mass m of microscopic particles, i.e. 
m = uipfiSV): 


N 

/jv(x, p, t) = ^ 5d (x - Xi{t)) Sd (p - Pi(^)) , (4) 

i=l 

for which the Vlasov-Poisson system then reduces to the equa¬ 
tions of motion for a Hamiltonian V-body system, i.e. the phase 
space density /w is trivially conserved along the characteristics 
Pi{t) defined by: 


and Pi = —m 'Vx 


(5) 


The equations of motion are thus easy to solve if the gravita¬ 
tional potential is known. 

Inserting the distribution function Q into Poisson’s equa¬ 
tion and convolving with the Green’s function of the 
Laplacian G{r) = — l/47rr, the potential is given by the New¬ 
tonian potential of N point masses. The initial coarse graining 
however implied that the mass is not concentrated in one point, 
but spread out over the volume SV, and so the unbounded 
two-body force is certainly a bad approximation to the contin¬ 
uous collisionless system. The evolution of the volume SV is 
non-trivial and so an approximation is made by replacing the 
point mass potential e.g. with a Plummer softened potential 




Gm 


y(x-Xj)2 +(;2’ 


(6) 


where e is the softening length (which may evolve over time). 
In principle initially e ~ is of the order of the mean 

particle separation and becomes smaller in regions of collaps^ 
Since its evolution is not followed, it is usually treated as a 
somewhat arbitrary parameter to soften the force field on a 
small fraction of the mean particle separation scale that seeks 
to suppress the discrete nature of the system while allowing 
for the highest force “resolution” possible. The validity of this 
approach is then typically inferred from convergence studies in 
which N and e are varied. In fact, the method we present in 
this paper can be viewed as following the exact evolution of 
the SV volume elements over time. 

The evolution of the V-body system is thus assumed to 
be analogous to the evolution of a macroscopic group of mi- 
crophysical dark matter particles, and therefore, it is assumed 
that it is an actual solution to the CBE. This is clearly true 
only when N goes to infinity and the force softening e goes 
to zero. For a finite number of particles, there are collision 
terms that would affect the details of dynamics such as chaotic 
and phase mixing, Landau damping and two-body interactions 
among others. Furthermore, it is not possible to guarantee an 
upper bound to the error introduced. This can lead to solutions 
dominated by numerical artefacts. 

Over the years, many authors have pointed out cosmologi¬ 
cal set-ups in which the discrete and collisional nature of the 


^ In fact, SV represents the phase space volume associated with a 
particle and will remain of the order of the mean particle separa¬ 
tion before shell crossing. After shell-crossing in regions of multi¬ 
streaming, it is still on the order of the mean separation, but the 
mean separation of particles on the same stream, not the mean 
separation after projection to configuration space as is done when 
adaptive softening is used (either by adaptive softening lengths or 
in AMR). It is thus hopeless to estimate e more accurately using 
only configuration space properties. 


A-body solution can be appreciated (e.g. 

Centrella & Melott 

1983| 

Centrella et al.|1988 |Melott & Shane 

arin|1989 |Diemand 

et al. 

2004||Melott et al.|1997| Splinter et al. |1 998 11 Wang et al.| 

2007| |Melott 2007| |Marcos 2008|). There are two situations 


worth mentioning. The first is simulations that feature a small- 
scale cut-off in their primordial density power spectrum and in 
which the A^-body method produces a large number of spurious 
clumps that resemble dark matter haloes (usually known as 
“artificial filament fragmentation”). Another example is when 
the gravitational interaction of two fluids that start with a dif¬ 
ferent power spectrum is followed, e.g. baryons and dark matter 
in the early Universe. In this case, two-body interactions make 
the fluids collisional resulting in an incorrect evolution of their 
relative spatial distribution ( Yoshida et al.||2003 Angulo et al. 


2013| . In all these cases problems are most severe if e is smaller 
than the mean particle separation (which is however clearly the 
case in all production run A^-body simulations where typically 
e is 1/20 to 1/60 of the mean particle separation). 

It is not yet clear how the problems described above affect 
standard single-fluid CDM simulations. However, there is a 
reasonable amount of concern about the correctness and accu¬ 
racy of A/’-body simulations. The importance that numerical 
simulations have for modern cosmology certainly warrants a 
level of skepticism and motivates the search for alternative 
methods that do not suffer from the problems of the A/’-body 
method. 


1.2 A phase-space element method 


In order to overcome the limitations of the traditional N-hody 
approach, various improvements and alternative schemes have 
been proposed over the years. 

Most A-body simulations employ a softening scale that 
is either constant or varies only over time. Several attempts 
have been made using spatially varying adaptive softening 
(e.g. Bagla fc Khandai||2009 lannuzzi fc Dolag]|2011 ), which 
suppresses two-body interactions more efficiently. Adaptive 
softening is, however, performed isotropically, which implies 
that it is not expected to improve dramatically the situation 
in regions of anisotropic collapse (which in cosmological simu¬ 
lation is basically everywhere except inside of haloes). 

Among the methodological alternatives to the A/’-body 
method are e.g. a reformulation of Vlasov-Poisson as a 


Schrodinger equation (e.g. [Widrow Kais^ 


1993 


Schaller 


|et al.|20l4| |Uhlemann et al.|2014| for classical and more recent 


examples), the ID waterbag method (e.g. Colombi ^ Touma 


2014 Colombi 2015), as well as the Lagrangian tessellation 


method of Hahn et al. (2013), hereafter HAK13 


The method proposed in HAK13 (referred to as “TetPM” 
by the authors) attempts to retain the continuous nature of the 
distribution function / by performing a Delaunay tessellation 
of the A/’-body particles in Lagrangian space (based on the 
Lagrangian tessellation idea of |Abel et al.||2012| and [Shandarin| 
et al.||2012 ). HAK13 demonstrated that by assigning the mass 


to tetrahedral volume elements, whose vertices are free-falling 
particles, anisotropic deformation can be followed much more 
accurately and the obvious discreteness effects of the A/’-body 
method, such as two-body collisions and spurious fragmenta¬ 
tion, could be overcome. However, a serious shortcoming of the 
method was also noted by HAK13: approximating the manifold 
by tetrahedra is accurate in the mildly nonlinear regime, but 
not in strongly non-linear stages. In this regime, the volume 
of / grows faster than what can be followed by the motion of 
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4 0. Hahn & R. E. Angulo 


the tetrahedra. This leads to a violation of the Hamiltonian 
structure of the equations, and thus it implies a lack of energy 
conservation. This can manifest itself in an overestimation of 
the density in the centre of haloes (HAK13), and it also affects 
the location of material tidally disrupted from infalling dark 
matter substructure ( Angulo et al.||2014 ). 

In this article, we present a general class of ’’Lagrangian 
phase space element methods” to solve the Vlasov-Poisson sys¬ 
tem with cold initial conditions. We will show that the approach 
of HAK13 is naturally included as the lowest order method that 
provides a continuous approximation to the distribution func¬ 
tion. When higher order elements are employed, the evolution 
of phase-space elements can be captured much more accurately, 
and caustics can be explicitly represented. Furthermore, we 
will demonstrate that an adaptive refinement of phase space 
elements allows to fully track their distortion, even during late 
highly nonlinear stages. This guarantees energy conservation 
and a direct control of errors. Therefore, we will argue that 
our method offers an interesting alternative to solve the CBE 
while fully resolving the evolution of the full phase-space dis¬ 
tribution function. The possibility to have the full fine-grained 
distribution function at hand offers exciting possibilities to 
study the behaviour of self-gravitating collisionless systems in 
much more detail than can be extracted from Wbody data. 

Our paper is structured as follows. In Section we in¬ 
troduce the phase space element method, discuss how we im¬ 
plemented the projection to configuration space for the mass 
deposit, discuss discretisation errors, and how they can be lim¬ 
ited by the adaptive dynamic refinement technique that we also 
present. After having introduced all methodology, in Sections 
and[^ we validate the performance of our method in various 
one-, two-, and three-dimensional test problems first without 
and then with self-gravity. Finally, we apply our method to a 
cosmological simulation with a cut-off in the initial spectrum 
in Section and demonstrate the advantages over previous 
approaches. We discuss our results and conclude in Section 


2 METHOD 

In what follows, we will present the theoretical foundations, as 
well as a practical implementation, of a new general class of 
numerical methods to solve the CBE referred to as ^^Lagrangian 
phase space elements”. 

We start in §2.1 by introducing the fundamental unit of 
our method: piecewise maps between Lagrangian space and 
Eulerian phase space. Then we discuss how this gives rise to 
piecewise density fields that are defined everywhere in space 
and can be made arbitrarily regular, but at the same time 
can contain caustics of various types. In §2.2 we focus on the 
errors associated to our scheme and how they propagate over 
time. In §2.3 we exploit this and show how the Lagrangian 
phase space elements can be adaptively split to track the full 
phase-space evolution of a fluid to a given accuracy. In §2.4 we 
discuss how our method allows an accurate and efficient calcu¬ 
lation of the evolution of a self-gravitating fluid. In the final 
section, §2.5, we further discuss the implementation of these 
ideas and parallelization strategies for an efficient execution in 
distributed-memory computers. 



Figure 1. Mapping between Lagrangian space and Eulerian config¬ 
uration space. The mapping can become multi-valued in projection 
to configuration space and can have singular points and curves. Here, 
as an example, the mapping of nine points forming a square in 
Lagrangian space are shown under a biquadratic map to phase-space 
and then projected into configuration space. 


2.1 Lagrangian phase space elements 


2.1.1 The Lagrangian manifold 

Let us begin by considering in more detail the evolution of the 
neighbourhood of a point with Lagrangian coordinate q G Q, 
which maps to Eulerian phase space as q (xq(t), Vq(t)) 
at some time t. An illustration for such a mapping in two 
dimensions is given in Eig. Shown is a piece of Lagrangian 
space, a square in this case, on which nine points are put down 
on a regular lattice. In Lagrangian space the density is uniform. 
This square is now mapped into four-dimensional phase space 
(where it is still a two-dimensional surface) and then projected 
back into two-dimensional configuration space. In the case 
shown, the map was given by some bi-quadratic polynomial. In 
general, the projection can lead to several points in Lagrangian 
space mapping to the same point in Eulerian configuration 
spac^ The projection becomes singular - corresponding to 
caustic subsets - where derivatives of the map to configuration 


space vanish (i.e. where Vq 


has at least one vanishing 


eigenvalue). The singularities arising from such projections have 
been studied in catastrophe theory in great detail (e.g. Arnold 
|et al.|19^ [Kidding et al.||2014[ for caustics of the Zel’dovich 
map in two and three dimensions). In what follows, we describe 
a procedure to produce a covering of the Lagrangian domain 
with such local maps. 

The differential nature of the cold distribution function 
guarantees that the space tangent to the sheet at q in phase 
space is uniquely defined as. 


TqQ : q 1-^ (Vq 0 Xq, Vq 0 Vq 


( 7 ) 


This tangent space evolves under its own set of equations, some¬ 
times called the “geodesic deviation equation” (GDE) because 
of the analogy to geodesics in curved spaces (jVogelsberger et al. 


2008 White fc Vogelsberger|2009 Vogelsberger fc White|20lT ) 
and the tetrahedral approximation is a secant approximation to 
this tangent spac^ Since the GDE gives us only infinitesimal 
information around a point, and the tetrahedral secants are 
very crude, a better local description of / in the neighbourhood 


^ The map to configuration space is thus in general surjective but 
not injective, while the one to phase space is bijective. 

^ For tetrahedra, the Vq are just given through the three directional 
finite differences that can be computed on a tetrahedron. 


© 0000 RAS, MNRAS 000 , 000-000 













Refined phase-spaee elements 5 


of q' can be achieved by considering the evolution of a higher 
order expansion of (xq, Vq) in the neighbourhood of q'. 


2.1.2 Lagrangian elements 


The central idea of our approach is to consider a decomposition 
of the Lagrangian manifold into a finite number of Lagrangian 
volume elements 3V. In the TetPM approach, these volume 
elements were tetrahedra, but in principle any volume decom¬ 
position is possible. In what follows we will use cubical volume 
elements to decompose Lagrangian space. Each element carries 
constant mass and defines a local map between Lagrangian 
space and Eulerian phase space. In particular, we will approxi¬ 
mate this map using multi-variate polynomials in what follows. 
We will show that the coefficients of these polynomials, each 
defined on its Lagrangian cubical element, can be represented 
by a number of supporting points (or flow tracer particles). 

For a three-dimensional Lagrangian manifold, we will thus 
consider the space of 3-variate polynomials Pk of order k in 
q = (^ 0 ,^ 1 ,^ 2 ), i.e. the set 


k 

-Pfe = {7r(q) I 7r(q) = Oc/St 9o O'!} > (8) 

a,/3,7=0 


and we note that dimP^ = (/c + 1)^. The Lagrangian envi¬ 
ronment 5V of our point q^ shall be described by a unit cube 
Ks = [0,1]^ centred on qb Matching dimPfe, we can choose 
the following subset of points C P3 on a regular lattice 


Efc = 


/ ^0 ii i 2 

V¥’ 1 


C Ks 


ij G 



(9) 


SO that the (k + 1)^ coefficients are fully determined if 

7r(b) is known for all {k -\- 1)^ elements b G E^. We also say 
that the element has {k-\- 1)^ degrees of freedom. Note that the 
requirement that E^ be a regular lattice is not a necessity but a 
mere convenience when calculating the polynomial coefficients. 
Furthermore, the lattice is “force-free” and gives a trivial equal 
mass decomposition. 

We are then able to express the mapping Ft of a point 
q and its environment between Lagrangian space and phase 
space through mappings of the unit cube centred on q as 
K 3 n(iG 3 ) C M®, i.e. specifically 


n: q '-S- (7rxo(q),7ra!i(q),7ra:2 (q) , 7r^;o (q)? 7r^;i (q), 7r^;2 (q)) (10) 


with all TT G Pfe and q G K 3 . 

Knowing position and velocity at point q alone without 
any neighbour information in the unit cube constrains only the 
0-order polynomial (/c = 0). This is comparable to the Wbody 
case, where also the continuous structure is not retained. In the 
TetPM method, four points are used. A tetrahedron represents 
thus the map q 1 -^ Aq + b, where A is a rank 2 tensor and b a 
vector, i.e. it describes a simple linear transformation between 
three dimensional Lagrangian and six-dimensional phase space. 
The case oi k — 1 corresponds to a trilinear element (made 
up from 8 points). In TetPM however, the unit cube was 
decomposed into six tetrahedra, so that one achieves six uni¬ 
linear elements per unit cube instead of one tri-linear element 
as would be the case at /c = 1 order in the approach discussed 
here. In this paper, we will mainly focus on the k — 2 tri¬ 
quadratic case, which is fully determined by knowing position 
and velocity at 3^ = 27 locations in the unit cube. Thus, for 
a system with a given number of degrees of freedom, fewer 
elements are needed when the order increases, since higher 
order elements have more internal degrees of freedom. 



Figure 2. Illustration of the deformation of a Lagrangian unit 
square represented by 3^ supporting points under a contraction 
inverse proportional to the distance from the centre. The square is 
mapped using a bi-linear polynomial map (a), a bi-quadratic map 
(b) and using a simple triangulation (c). 



Finally, without loss of generality, the approach discussed 
above for a single unit cube K 3 can be extended to all of by 
tiling R^ with such cubes and performing the mapping between 
Lagrangian space and phase space separately for every tile of 
Lagrangian space. For cosmological simulations with periodic 
boundary conditions one uses a finite tiling of the 3-torus. 


2.1.3 Densities from Lagrangian elements 

In general, the uniform density of a cubical element of La¬ 
grangian space is mapped to a non-uniform density curved 
patch in conhguration space. We note that the exact density 
of the mapped unit cube (of mass m) is obtained directly from 
the Jacobian of the configuration space part of the coordinate 
transformation J = Vq 0 Xq (i.e. Jij = ||^) as 

p = m |det = m/i/iy, (11) 

where the last equality just highlights the connection to dif¬ 
ferential geometry, i.e. the curved Lagrangian elements define 
a metric space with the metric G = J^J and g = det G, so 
that caustics correspond to ^ = 0. It is obvious then that 
tetrahedra possess a constant metric, while the higher order 
polynomial elements are of non-constant metric. Equivalently, 
density is constant for the tetrahedron, and the reciprocal of a 
polynomial for /c > 0, which means that elements with k^ 1 
can explicitly track caustics, while Wbody and TetPM can 
not (here caustics arise only when a tetrahedron has vanishing 
volume). 

In Figure we illustrate how a Lagrangian unit square 
maps to configuration space, and how the inferred density 
depends on the adopted order of the polynomial. In this two- 
dimensional example, we represent the Lagrangian unit square 
by 3 X 3 = 9 supporting points, and displace the Eulerian 
coordinates outwards proportional to the distance from the 
centre of the square. In case a., we show the result of adopting 
k — 1 bi-linear polynomials on four sub-squares tiling the full 
square. In case b., we show the result of adopting /c = 2 bi¬ 
quadratic polynomials on the full square. Finally, in case c., we 
triangulate the full square into eight triangles. We see that the 
mapped shaped in the two linear methods a and c are identical, 
but the inferred density is more accurately representing the 
radial gradient in the bi-linear case than in the tessellation case. 
The bi-quadratic method leads to parabolic edges and a smooth 
density distribution in the interior. The resulting density field 
(before shell-crossing) is piecewise linear in one direction for the 
tri-linear map, but generally not continuous. The tri-quadratic 
map, remedies this by allowing for gradients to change smoothly 
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Figure 3. Illustration of the difference between the density induced 
from the metric of a bilinear form (a), and that of a simple triangu¬ 
lation (b) of the same quad (1234) under a transformation in which 
the top two vertices (1 and 4) are flipped. The bilinear form leads 
to a non-convex projection with divergent density at the caustic 
point, while, naturally, the triangles remain convex, but have always 
constant density and become overlapping. 


inside of an element. It is always continuous but generally not 
continuously differentiable. Finally, triangular elements are 
piecewise constant which leads to a density field where many 
elements are needed to faithfully describe gradients. Given that 
all three examples use identical data, the advantage of higher 
order methods is clearly obvious in this case. 

We illustrate how polynomial elements trivially capture 
caustics in Fig. where, by a simple linear map to Eulerian 
space the top vertices of a square are flipped. In the left panel, 
a, the result for a tri-linear map is shown. Here, the mapped 
square has the correct topology and an actual singularity at the 
centre. This is quite in contrast to what can be achieved with 
a triangulation, which is shown in the right panels, b. Here, a 
choice must be made as to how to triangulate the square (in 
2D there are two possibilities). The linear map flipping the top 
vertices then maps each triangle again to a triangle of the same 
size, which leads to a final density field that has no singularity, 
but also the mapped square does not have the correct topology 
(nothing should have been mapped to the region of the dark 
grey triangle in the bottom right panel). Obviously, for N-body 
particles, the flipped and un-flipped state are indistinguishable. 
This anisotropy of a simple triangulation exists of course also 
in three dimensions. In order to avoid the intrinsic anisotropy 
inherent in the Delaunay triangulation of the unit cube, HAK13 
have proposed an alternative decomposition of the cube into 
eight overlapping tetrahedra instead of six (the decomposition 
they employed for the TCM scheme as opposed to the Delaunay 
that was used for the T4PM; Figure 3 in HAK13, where the 
TCM decomposition can be thought to be rearranged into 
one cube). In this paper, whenever we use tetrahedra, we thus 
employ the same non-tessellating double-covering of the unit 
cube as in HAK13 to avoid an anisotropic decomposition with 
tetrahedra. 


2.2 Time evolution, discretisation errors and 
refinement strategies 

Next, we will consider the time evolution of the Lagrangian 
elements. We will demonstrate that the evolution equations for 
the tri-polynomial elements become separable evolution equa¬ 
tions for the coefficients that can be solved by simply evolving 
the supporting points as freely falling flow tracer particles. The 
representation by finite order polynomials, however, introduces 
a truncation error that has to be bounded in order to maintain 
the Hamiltonian character of the system. This then leads to 
natural refinement criteria that attempt to keep the truncation 
error within reasonable limits and are necessary to guarantee 
e.g. energy conservation in general. 


2.2.1 Time evolution of elements 

The phase space density is, as in the Wbody case, conserved 
along the characteristics given by the canonical equations of 
motion (cf. eq. with the only difference that we are no longer 
concerned with a discrete set of characteristics but that they 
have an underlying manifold structure and thus generalise to 


Xq = Vq, and Vq = - Vx4>\^^ , with q G S (12) 

along which df /dt — t). 

The time evolution of the phase space elements can be 
obtained from the time derivative of eq. ( |1Q[ ), which contains 
six time derivatives of polynomials 7r(q). Since the Lagrangian 
coordinates q do not depend on time, we are left with time 
derivatives of the coefficients OocjS'y. Furthermore, consistency 
between the supporting points and the coefficients then requires 
that the coefficients themselves follow canonical equations, i.e. 


Xq ;/37 — 'VoifS'y 




;^7, 


(13) 


where Jij = dxijdqj is the Jacobian and f = Vq0 is the force 
mapped to Lagrangian space. It is completely equivalent to 
evolve the coefficients using their own canonical equations, 
truncated at some order, or using a number of freely falling 
test particles (the “flow tracers”, their number per element 
matching the degrees of freedom of the element) from which 
these coefficients can be calculated. This second possibility 
allows us to use a set of particles that are evolved like massless 
Wbody particles to describe the evolution of each element. 


2.2.2 Discretisation errors 

The time evolution above is exact if the series expansion is not 
truncated (i.e. for k ^ oo). When approximating the evolution 
equations above by piecewise polynomial expansions, a trun¬ 
cation error is however introduced since the force field across 
an element is only considered to the order of the polynomial 
expansion. The force across the element is in general however 
given as 

oo 

Fq= ^ (14) 

cx,f3,'y=0 

SO that the elements capture correctly the evolution of the first 
{k-\- 1)^ terms, and the error in the momentum update is given 
by 

oo 

Av = -J“^ E fc,/3-,qSqiqE (15) 

a,/3,7=fc-|-l 
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which implies that second order terms are sourced by a non¬ 
constant tidal field across the element, third order terms by a 
non-linearly changing tidal held and so on. This implies that 
momentum is only conserved at the level of this truncation 
error. The error is expected to be small when the potential 
is smooth across elements, which is usually true when the 
elements are small, but is certainly an invalid assumption at 
later times. Estimating the magnitude of terms of order k -\- 1 
and larger across elements thus naturally leads to a rehnement 
criterion, so that when an element is split into smaller elements, 
the truncation error remains bounded. We will elaborate on 
this possibility next. 

Refining on force: As we have just discussed, errors arise when 
the variation of the gravitational force across the element is not 
accurately captured. To turn this error source into a criterion 
for adaptive rehnement, we calculate the force at using two 
different orders of the interpolating polynomial, i.e. we calculate 


k k' 

Fq= C/37 9ogfg2 and Fq = 

a,/3,7=0 a,/3,7=0 

(16) 

with k' > k, then we can estimate the maximum relative force 
error and attempt to keep it bounded 

A/ max ||Fq - F^H / ||Fq|| < A/^ax/2^ (17) 


where ^ = 0,1,... is the rehnement level of the element. Specif¬ 
ically, in our implementation, we choose k' = 2k and chose 
not to determine the maximum but approximate the problem 
by evaluating A/ at half-points, i.e. at the locations where 
new supporting points would be placed if the element were 
to be rehned (cf. Fig. [^. E.g., for the tri-quadratic element, 
we compute the tri-quartic interpolant by simply combining 8 
tri-quartic elements together into a fundamental grid unit of 
5x5x5 supporting points which provides us with the necessary 
amount of points to compute the tri-quartic polynomial. 
Refining on high order derivatives: Alternatively, and similar 
in spirit to the rehnement criterion based on second derivatives 


proposed by Lohner et al. (1987), we can keep errors arising 


from derivatives of order k -\- 1 across our elements bounded 
by using a criterion that splits the Lagrangian cubes when the 
dimensionless ratio of derivatives of order k -\- 1 to derivatives 
of order k exceeds a specihed threshold. Since we will focus on 
the case of /c = 2 in the remainder, we use the ratio between 
third and second derivatives of the form 


?7 _ 1 | —|/z-2 + /z-l — /z+l + |/i+2 I 

^ 2^ \ — hfi-2 + fi-1 I + |—/z+1 + \ 


(18) 


F = 



+ |/z-l| + \ fi+l\ + 



where indices i run over the supporting points in Lagrangian 
space. Note that in three dimensions, we calculate the deriva¬ 
tives r]i and Si along dimension i and then evaluate the quantity 


n - f Vo + nf + 
\S^o+Si + Sl) 


(19) 


A Lagrangian element is selected for rehnement into smaller 
elements when © > ©max- We typically use e = 0.01 and 
thresholds are 0 < ©max < 1, where low values lead to more 
aggressive rehnement, but typically ©max — 0.1. 


• • • 

• • • 

• • • 

a. element is flagged for 
refinement 


b. positions and velocities are 
determined at mid-points 


c. new elements are created 
using the mid-point values 


Figure 4. The three steps followed in our oct-tree based rehnement 
of elements in Lagrangian space: Element hagging by a rehnement 
criterion (left), interpolation to mid-points (middle) and splitting of 
the element into eight elements of equal mass (right panel). 


Other rehnement criteria are of course possible and might 
possibly perform better than what we consider in this paper. 
What is the optimal criterion, i.e. minimises errors with the 
smallest number of hagged elements for a given problem, is an 
open question. It will thus be interesting to consider alternatives 
to the two presented above in the future. 


2.3 Adaptive refinement of elements 

We next discuss the adaptive splitting of Lagrangian elements. 
We adopt an oct-tree based rehnement strategy, meaning that a 
cubical element can be split into eight smaller cubical elements 
coextensive with the original element in Lagrangian space. 
Splitting this way in Lagrangian space automatically implies 
an equal-mass splitting and exact mass conservation. 

In order to illustrate the procedure we follow, we dis¬ 
cuss the steps involved using the Lagrangian element in two 
dimensions shown in Figure The steps followed during a 
rehnement step are then as follows (and trivially generalise to 
three dimensions): 

1. The rehnement criterion is evaluated for each element 
and if it is fulhlled, the element is hagged for subsequent 
rehnement (cf. left panel of Fig. |^. 

2. A regularisation step is performed to ensure that neigh¬ 
bouring elements differ by at most one level of rehnement. 
All elements that need to be rehned to maintain regularity 
are also hagged. 

3. For all hagged elements, new supporting points are inserted 
by evaluating eq. ( |10| at half-point locations (yellow points 
in middle panel of Fig. [^. 

4. The element is split into eight elements of l/8th the mass 
of the original element using the interpolated midpoints (cf. 
right panel of Fig. |^. 

The algorithm outlined above is guaranteed to provide a 
covering of Lagrangian space at all times (i.e. no holes will 
appear and mass is explicitly conserved), and changes in re¬ 
hnement level are constrained to be at most one between 
neighbouring cells. This procedure thus implements a tree-based 
AMR method in Lagrangian space. Since this AMR structure 
can be represented by an oct-tree, all operations on the tree can 
be encoded using integer arithmetic (we represent Lagrangian 
space coordinates by three 20bit integer numbers that can be 
stored in one 64bit integer) allowing for fast identihcation of 
neighbours without explicitly storing pointers. 

All how tracers are allowed to move freely, except those 
that occupy the faces between elements of different resolution. 
At these coarse-fine boundaries, flow tracers are constrained to 
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• • • 

• • 

• • • 









• • • • 

• • • 






• • • 


m=2 


171=3 


Figure 5. Mass deposition through oct-tree based splitting in 
Lagrangian space. The mass distribution of elements is approximated 
by recursively split mass carriers (black dots at one, two and three 
levels of recursion from left to right). Their position is determined 
using the local mapping between Lagrangian and Eulerian space 
which is determined from the supporting points (“flow tracers”, in 
blue). 


volumes, or 1 quadratic unit volume (cf. Fig where the 
blue points indicate the supporting points, or “flow tracers” 
and the shaded grey area the Lagrangian volume spanned 
by them). In contrast to the tetrahedral elements of TetPM 
(which have constant density), our tri-linear and tri-cubic (and 
all higher order) mapping of these elements results in a non- 
uniform density. Fortunately, their projection can be trivially 
approximated with a simple oct-tree based particle deposit 
scheme. Since the density in Lagrangian space is uniform, we 
perform an oct-tree subdivision of the Lagrangian unit cube 
Ks up to some level m that decomposes each element into 2^"^ 
discrete particles (or “mass carriers”) as illustrated in Fig. 
For every element, we have the (/c+1)^ supporting points Sfc, cf. 
eq. (it, that fully determine the mapping H, eq. |To| . We then 
represent the element of total mass m at approximation level 
m through a space partitioning into octs with mass ml2^^ 
and Lagrangian centres of mass at the locations Em C Ks 


remain in Lagrangian space at the mid points between coarser 
particles. This can be achieved by using an acceleration in¬ 
terpolated from the flow tracers shared between the coarse 
and the fine element to the half points instead of evaluating 
their actual acceleration from the gravitational force at their 
location. This ensures that also no holes appear when the 
Lagrangian manifold is mapped to Eulerian space. 


2.4 Gravitational Evolution 

We now describe how we compute the gravitational force field 
sourced by the Lagrangian elements and discuss how to follow 
their time evolution. We will discuss how we compute the total 
density field p(x) = X)ePe(x), where e indexes all Lagrangian 
elements. Having an estimate of the total density field, it is then 
possible to solve Poisson’s equation Vx0 = 47rGa“^ {g — g) 
and compute the gravitational accelerations. 


2.4-i Approximating the projected mass distribution of 
Lagrangian phase space elements 

A basic element consists, in general, of a curved hexahedron 
whose inner density field is described by a rational function. 
In principle it may be possible to directly compute in some 
simple cases the force exerted by these geometrical objects. In 
practice, however, this would be computationally extremely 
expensive (if possible at all). A more practical alternative is 
to project the Lagrangian elements onto a regular mesh in 
Eulerian configuration space and compute the forces by solving 
Poisson’s equation using Fourier or relaxation methods. 

Choosing such an approach, the elements need to be de¬ 
posited onto a grid. In general there is no known method to 
integrate an arbitrary rational function over the intersection of 
curved hexahedra and cubes (i.e. grid cells). An alternative is 
to approximate them by polyhedra and then decompose them 
into a set of tetrahedra, for which an exact deposit is possible 
( Powell Abel|[2015 ). Another alternative, which we adopt 
here, is to recursively decompose, in Lagrangian space, the 
mass of the hexahedra into equal-mass particles located at the 
centre of mass of each subvolume. This set of mass carriers is 


then mapped to Eulerian space using eq. (10) and deposited 


onto the mesh as point masses. We describe this approach in 
more detail below. 

Our cubical Lagrangian elements are fully described by 
a set of 3^ points in n dimensions, spanning 2'^ linear unit 


/ 2zo — 1 2ii — 1 2^2 — 1 

I gm-j-l ’ gm-j-l ’ gm-j-l 


h 



( 20 ) 


The mapping to Eulerian space for all points in Em is then 
given by simply applying the mapping H, i.e. n(Sm), that has 
been determined using the supporting points E^. In practice, 
since the points at which H needs to be evaluated are given by 
the simple oct-tree structure, the coefficients of the 3-variate 
polynomials can be pre-calculated. 

One apparent disadvantage of our approach is that mapped 
particles have a deformed grid regularity. For completeness, we 
mention three alternatives to deposit particles at the end of 
the recursive decomposition. 


i. A very smooth deposition could be achieved by triangu¬ 
lating the final oct and depositing the resulting tetrahedra 


using an exact deposition algorithm (cf. Powell & Abel 


2015). 


ii. A glass particle distribution (cf. |White|p^996 ) can be 
used to map the elements to Eulerian space suppressing 
possible Moire between the density mesh and the deposited 
particles (i.e. Em is replaced with a glass distribution in 
[ 0 , 1 ]). 

iii. A random sampling is also possible, where Em is replaced 
with an arbitrary number of uniformly distributed random 
points drawn from [0,1]. This approach is not suitable to 
sample the mass distribution in time evolving simulations 
since a single random realisation is not force free (in con¬ 
trast to the glass and the hierarchical lattice). However, 
this approach is favourable for the creation of images since 
Poisson sampling gives the visual impression of a smoother 
field than the other approaches using the same number of 
sampling points. 


All of these three alternatives result into no or fewer symme¬ 
tries, suppressing possible Moire between the density mesh 
and the deposited particles. Currently, we have successfully 
implemented ii) and iii) in our code. However, we have tested 
that this does not affect our results in practice, thus we will 
not employ the simple grid deformation in the reminder of the 
paper. 


2.4-2 Force Calculation and Time Stepping 

The particles in Em can deposited using any of the known 
particle-mesh charge assignment schemes (cf. [Hockney fc East~ 
wood} 19^ . We have adopted the cloud-in-cell (CIC) deposit in 
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this paper for the pseudo-particles 'Em determined as described 
above. Once the density field is obtained, we solve Poisson’s 
equation using a standard Fourier method. We first perform a 
Fast Fourier Transform of the field, then compute the gravi¬ 
tational potential by multiplying with Green’s function for a 
fourth order finite-difference Laplacian, i.e. 

G{k) — [cos(2/ccc) — 16cos(A;x) + 15 

+ cos(2— 16 cos(/c^) + 15 (21) 

+ cos{2 kz) — 16cos{kz) + 15] / 6. 

We then perform the backward Fourier transform and apply a 
4th order finite difference gradient to compute the gravitational 
force field. We finally interpolate this field to the position of the 
supporting points and advance their positions and velocities 
just as would be done for standard Wbody particles. 

We note that our implementation of the force algorithms 
do not require all mass carriers to be concurrent in memory. 
This is a great advantage in keeping the memory footprint of 
our code low since the ratio of fiow to mass tracers will be 
small. 

Once forces are computed, we adopt a leapfrog kick-drift- 
kick time integration scheme. Since our forces are given by a 
PM algorithm, we employ a global timestep set by the minimum 
of the timestep assigned to fiow tracers, which in turn is set 
by a Courant criterion: 


The first is the one-time construction of Lagrangian 
patches from a set of simulation particles. For this, we start by 
performing a decomposition of the computational domain using 
Lagrangian (as opposed to Eulerian) coordinates. This makes 
most Lagrangian patches to have locally all their supporting 
points. For those lying in the surface of the domain boundary 
this is not true and they will need to communicate with other 
processors and import the required set of vertex. This can be 
done in a very efficient manner by exploiting the fact that 
there is a unique relation between a particle ID, its Lagrangian 
patch, and the MPI task hosting it. 

The second aspect is that our fundamental data struc¬ 
ture are Lagrangian patches, on which an Eulerian domain 
decomposition is performed. We keep a locally complete set of 
supporting points for these structures, which implies a slight 
overhead in terms of memory (since the boundary supporting 
points are duplicated) and slight suboptimal work/load balance. 
However, our choice has the huge advantage of allowing all 
quantities related to Lagrangian patches to be computed locally, 
without requiring any inter-node communication. This makes 
the creation of mass carriers embarrassingly parallel, while 
their subsequent deposit to the mesh requires communication. 


3 TESTING ADAPTIVE REFINEMENT: PHASE 
MIXING IN A FIXED POTENTIAL 


t 


min 


Ui 


( 22 ) 


where Ax is the EET cell size, Ui is the magnitude of the 
velocity for the i-th particle, and (7 is a parameter which we 
usually set to somewhat less than unity. 

For completeness, we mention that we have also imple¬ 
mented an oct-tree algorithm, which allows us to have a very 
high force resolution and also the implementation of individual 
time steps for every fiow tracer. The idea is to build a full 
oct-tree down to some specified node length, and then use 
fiow tracers to refine the tree topology in a given simulation. 
Then, the moments of the multipole expansion of the tree force 
of each node are computed using the mass field given by all 
mass carriers. In the remainder of this paper we, however, only 
employ forces computed by the PM algorithm to avoid possible 
artefacts due to spatially-correlated force errors introduced by 
the use of finite terms in the multipole expansion of the force 
field. 


2.5 Implementation and Parallelisation Strategies 


Interesting problems in cosmological structure formation re¬ 
quire a full exploitation of state-of-the art supercomputer cen¬ 
tres. This in turn requires algorithms capable of being efficiently 
executed on up to thousands of processor cores with a mixture 
of shared and distributed memory, and for billions of resolu¬ 
tion elements. Thus, we have put particular emphasis on these 
aspects in our code. 

We have implemented our algorithms in the distributed- 
memory parallel code L-Gadget3. The L-Gadget3 code is a 
memory efficient version of P-Gadget3 ( Springel|2005| Springel 
et al. poosj i specially tailored for dark matter only simulations 
and that was successfully executed on 12,000 computer cores 
for the MXXL project ( Angulo et al.|201^ . Most of the parallel 
algorithms can be directly used for our Lagrangian element 
method, but there are two aspects noteworthy. 


In a first test problem, we evolve a cube of cold collisionless 
material without self-gravity in a fixed potential. This will allow 
us to study the general feasibility of adaptive refinement as well 
as the impact of the refinement criterion on the accuracy of the 
subsequent evolution. After this, in the subsequent sections, 
we then turn to test problems with self-gravity in one and two 
dimensions. 

As discussed in HAK13, the most important shortcoming 
of TetPM is the inability to follow the dark matter sheet when 
strong mixing occurs (e.g. in the inner parts of haloes). In this 
case, the surface of the sheet in phase space grows rapidly (see 
also below) by being continuously stretched, and a given small 
number of points is insufficient to describe its evolution over 
longer times. New points need to be inserted in order to follow 
the motion of the stretched regions accurately. 

To provide a robust and simplified framework to test the 
performance in mixing situations, we study a cube K of col¬ 
lisionless, cold but not self-gravitating material orbiting in 
a non-harmonic potential. The cube is initially represented 
by 5^ particles spanning 8x4^ tetrahedral, 4^ tri-linear or 8 
tri-quadratic elements respectively. For comparison, we will 
compare to the high-resolution A-body solution in which we 
sample the cube by 128^ particles. The cube is given an initial 
uniform velocity that is offset from the centre of the potential 
(i.e. every point in it has non-zero angular momentum). Specif¬ 
ically, we consider a constant Plummer gravitational potential 
for which the acceleration field is given by 






3/2 ■ 


(23) 


with some a and e whose specific values are irrelevant in our 
case as we only want to study the qualitative behaviour. Since 
the potential is constant in time, the motion of any point 
q G A in phase space is determined uniquely by its initial 
coordinates in phase space (xq(0), Vq(0)). Hence, the choice of 
the Lagrangian elements used to represent the orbiting cube 
has no impact on the dynamics of the vertices. The order of 
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t=6 


t=12 


i. tetrahedra 



ii. tri-linear 

^ 


2|- ' ' ' ' I ' ' ' ' I ' ' ' ' I ' ' ' ' I ■ ' ' ■ -I 



. . . 

0 2.5 5 7.5 10 12.5 

time [arbitrary units] 


iii. tri-quadratic 



iv. high-res N-body 



Figure 6. Images of a uniform cube phase-mixing in a Plummer 
potential at t = 6 and at t = 12. The top three panels use the exact 
same 125 supporting points but differ in the order of the Lagrangian 
elements determined by them (i.e. they have the same degrees of 
freedom, but differ in the number of elements). The bottom row 
shows the result of sampling the cube with more than 2 million 
N-body particles. All Lagrangian element schemes diffuse mass 
towards the center, but the loss of energy and resulting ’overmixing’ 
is smallest for the tri-quadratic elements (as quantified in Fig. [^. 


the elements will however impact the distribution of matter 
between the vertices, as well as where new vertices are inserted 
if adaptive refinement is used. This simple test problem will 
thus allow us to study a few key diagnostics: (1) how the order 
of the Lagrangian elements impacts where mass is deposited 
over time and how well energy is conserved for the elements, 
(2) which refinement criteria allow to follow the mixing well 
and allow an energy conserving scheme, as well as (3) which 
refinement criteria yield a better tracking of the evolving phase 
space sheet with the smallest number of newly inserted vertices. 


3.1 Phase mixing without refinement 

We first demonstrate once again that in mixing situations the 
Lagrangian element schemes lead to mass deposit towards 
the centre of the potential, leading to a violation of energy 


Figure 7. Evolution of the total energy (top) and volume (bottom 
panel) of a uniform density cube orbiting in a static Plummer po¬ 
tential, leading to phase mixing. The elements are not adaptively 
refined in this test, i.e. the degrees of freedom in the system are 
kept constant. In both panels we show results for tetrahedral (dot¬ 
ted blue), tri-linear (dashed yellow) and tri-quadratic (solid red) 
Lagrangian elements. The significantly higher volume for the tetra¬ 
hedra is spurious and should be divided by ~ 2 at late times (see 
text for details). 


conservation. In Figure we show the sheared cube at two 
times t = 6 and t = 12 (left and right columns, respectively) 
for all schemes: tetrahedral, tri-linear, tri-quadratic Lagrangian 
elements based on only 5^ supporting points to describe the 
initial cube as well as the respective Wbody solution using 
128^ particles to sample the initial cube (from top to bottom). 
We note that generally higher order schemes, as expected, 
help to represent the true mass distribution more faithfully, 
with the tri-quadratic scheme providing a reasonably good 
description at t = 6, where no mass is deposited into disallowed 
orbits close to the centre. This is not the case for the linear 
schemes where significant diffusion towards the centre has 
occurred already byt = 6. Att = 12, the situation is even 
more dramatic, as all schemes now deposit mass into the very 
centre of the potential, with the tetrahedra even assigning 
most mass to the central region. This is exactly the situation 
that lead to the biased central densities of haloes discussed 
by HAK13. While the curved elements help to some degree 
to faithfully describe the wrapping of the sheet, it is obvious 
that a refinement technique is needed to fully resolve diffusion 
towards the centres of potentials and thus maintain energy 
conservation. 

To quantify the visual impression gained from Fig. we 
calculate the total energy of the cube over time 


Ek 



d\. 


(24) 


which can be particularly easily evaluated for an analytic 
potential and is conserved if the potential is static as in our case. 
Furthermore, the degree of sheet wrapping can be quantified 
by the total volume of the cube as given by 



(25) 
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0 = 0.1 


t=12 


0=0.05 


i. tetrahedra 



iii. tri-quadratic 



Figure 8. Images of the mixing cube from Figureat t = 12 with 
dynamical adaptive refinement using the criterion from eq. ( |19| ). 
Dynamical refinement increases the number of degrees of freedom 
available to the system (beginning from the same 125 as in Fig. [^. 
Thanks to the adaptively inserted vertices, all schemes now repro¬ 
duce the mass distribution at t = 12 reasonably well, with higher 
order schemes still performing better. Energy is increasingly better 
conserved (as quantified in Fig. |^, for the tri-quadratic elements to 
better than 0.1 per cent. See text for a detailed discussion. 


In Figure we show the evolution of E and V with respect to 
their initial values. We find that at early times, the tetrahedral 
and tri-linear elements lose energy at the same rate, for t >7, 
the tetrahedra lose somewhat less than the tri-linear. At all 
times, the tri-quadratic scheme conserves energy significantly 
better, but still loses energy. The sheet volume is identical 
for all methods for t < 2.5 and disagrees after, with tri-linear 
and tri-quadratic roughly following each other and tetrahedra 
estimating a significantly larger volume (roughly a factor of 
two). This is however simply due to the fact that the tetrahedral 
decomposition is not an actual tessellation to avoid intrinsic 
anisotropies (cf. the discussion in Section 2.1.3). It is obvious 
from these results, that adaptive refinement of the elements will 
be required to both conserve energy and arrive at a converged 
answer for the volume of the sheet. 



time [arbitrary units] 


Figure 9. Evolution of the total energy (top) and volume (bottom 
panel) of the uniform density cube orbiting in a static Plummer 
potential with dynamical adaptive refinement (to be compared with 
Fig. [^that shows the same results without refinement). For each of 
the element types, we show the evolution for various choices of the 
refinement threshold parameter 6 (c.f. eq. |19| ). With the tri-quadratic 
elements energy is well conserved to a degree controllable by the 
refinement threshold 6. 


cantly larger amount of refinement was needed to arrive at a 
solution of comparable qualit}j^ 

First, we will now investigate the impact of the order of 
the element as well as the threshold used in the refinement 
criterion on the evolution of the cube. In Figure we show the 
distribution of mass at time t — 12 (to be compared to the right 
column of Fig.[^ for the three Lagrangian element types as well 
as refinement thresholds of ^ = 0.1 and 0 = 0.05. We note that 
adaptive refinement has significantly improved the agreement 
with the density distribution of the iV-body reference solution 
for all element types with increasing agreement, as expected, 
with higher order and smaller thresholds 0. In fact, the tri¬ 
quadratic solution with threshold 0.05 outshines the reference 
solution which is grainy due to the Wbody sampling, while 
both use about two million points to represent the solution. For 
0 — 0.05 the lower order elements refined to at least this number 
of vertices, while still suffering from slight density fluctuations. 
For ^ = 0.1, all elements refined to make use of a few hundred 
thousand vertices to represent the solution. We complement 
this visual inspection with a quantitative study of the total 


3.2 Phase mixing with refinement 


We now allow for adaptive refinement using the refinement 
criterion based on limiting high-order polynomial contributions 
to the represented velocity field as expressed in equation (19). 
We have tested also refining on forces, but found that a signifi- 


^ This can be understood simply from the difference in location w.r.t. 
the potential centre where the respective criteria trigger refinement. 
The velocity refinement inserts new vertices preferentially at large 
radii where large differences in velocity for an element occur first. 
Quite in contrast to this, the force refinement will split the innermost 
elements first since they experience stronger force gradients. The 
latter proved to be less optimal in this case. 
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Figure 11. Volume over time of the mixing cube. The curves 
for tetrahedra (dotted blue) and tri-linear elements (yellow short- 
dashed) are identical to those in Figure]^ but shown over a longer 
time interval. In addition, we use 2nd order (red long-dashed) or 
4th order (solid red) tri-polynomials when inserting new Lagrangian 
vertices. 


energy and the sheet volume in Figure]^ In all cases (we also 
show results for 0 = 0.2 in this figure), energy conservation 
is greatly improved, but remains only for the tri-quadratic 
elements at the sub-percent level by t = 12. Interestingly, the 
different elements seem to converge to different asymptotic 
curves for the evolution of the sheet volume when the threshold 
0 is reduced. The sheet volume is a very sensitive diagnostic 
since all perturbations and numerical errors will enhance the 
sheet growth rate. The increase in sheet volume is entirely 
due to the density inhomogeneities visible in Figure for the 
tetrahedral and linear elements. While it may be possible to 
find refinement criteria more suitable to the low order elements, 
we conclude at this stage that the tri-quadratic elements are 
significantly improving energy conservation and quality of 
the solution with a significantly smaller number of necessary 
vertices to be inserted via adaptive refinement. 

We push this observation even further and will next use 
a tri-quartic, i.e. 4th order interpolant over 8 tri-quadratic 
elements when inserting new vertices. We still represent the 
elements only by tri-quadratic maps. This allows to improve the 
quality of the solution even further. In Figure^] we show the 
mass distribution of the cube at a twice later time than before, 
i.e. at t = 24. By this time, the inner regions have already mixed 
very substantially. In all cases we use a refinement threshold of 
0 = 0.1. The left panel shows tri-quadratic elements with new 
vertices inserted using a second order interpolant, while the 
middle panel uses a fourth order interpolant and the right panel 
shows the Wbody reference solution. Using the fourth order 
interpolation improves energy conservation by a factor of 4 
and once again improves the solution. Furthermore, at t = 24, 
using the second order vertex insertion leads to ~ 750'000 
vertices in total, fourth order insertion requires only ~ 500^000, 
which is only 1/4 of the particles of the Wbody reference run. 
The fourth order interpolation also significantly improves the 
evolution of the sheet volume (see Figure pT]), where noise due 
to the second order interpolant makes the sheet grow faster 
than it should at very late times. 

We can conclude from the analysis in this section that 
adaptive refinement, especially when combined with high order 



PM force res. 


Figure 12. Error in particle displacement in the plane wave collapse 
problem at shell-crossing as a function of PM force resolution for 
A/’-body (stars) and quadratic elements. Both methods use the same 
number of particles and supporting points (32^, which results in 
16^ quadratic elements). The mass distribution of the quadratic 
elements is sampled at two different resolutions (squares and circles). 
See Hr] for a discussion. 


schemes for mass assignment and refinement can make the 
Lagrangian phase space element approach energy conserving 
while overcoming the inherent graininess of the N-hody method 
with significantly fewer active particles. 


4 TESTING SELF-GRAVITATING 
LAGRANGIAN ELEMENTS 

We have seen in the previous section that adaptive refinement 
can exquisitely track the evolution of the phase space sheet 
even in situations of strong mixing. This test was however 
performed in an external potential meaning that local errors in 
the solution did not feed back into the global solution. We will 
thus investigate in the subsequent sections the performance of 
the adaptively refined elements when self-gravity is present. 


4.1 Test 1: One-dimensional Plane Wave collapse 
(axis-aligned) 


We will only briefly consider the axis-aligned one-dimensional 
plane wave collapse here, before we move on to computationally 
more challenging problems. This problem consists in solving 
the self-gravitating collapse, in an Einstein-de Sitter cosmology, 
of a one-dimensional potential perturbation 

0(q) = ^ cos(kp • q), (26) 


where kp is parallel to one of the cartesian axes of the compu¬ 
tational volume, ||kp|| = 27r/L and A is chosen so that shell 
crossing occurs at an expansion factor of across- 

For the plane wave, the solution (and thus particle position, 
velocity and acceleration) can be calculated analytically before 
shell-crossing occurs ( Zeldovich|1970| ) but the analytic solution, 
of course, relies on “unsmoothed” gravity. In the case that force 
softening goes to zero (and thus the PM resolution to infinity), 
the analytic solution should be approached. We thus present 
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t=24, 0=0.1 


i. quadratic interpolant for refinement 


ii. quartic interpolant for refinement 


iii. high-res N-body solution 





Figure 10. The mixing cube at a much later time t = 24 for the tri-quadratic scheme but using different orders of interpolation when 
inserting new vertices triggered when 0 > 0.1. The left panel shows the final mass distribution if new vertices are inserted with the same 
order as the element (i.e. tri-quadratic), while the middle panel shows the result if vertices are inserted by combining 8 elements and using 
a tri-quartic interpolant. The right panel shows the distribution of the 2 million A/’-body particles at t = 24. The relative error in energy 
conservation using the tri-quadratic interpolant is 0.2 per cent at t = 24, while it is 0.05 per cent for the tri-quartic interpolant. See text for 
detailed discussion. 


a plot identical to what is shown in the top panel of Figure 7 
of HAK13 in Figure the position error of flow tracers at 
the time of shell-crossing as a function of the force resolution 
while keeping the number of flow tracers constant. The solution 
should converge to a sharper and sharper caustic with increas¬ 
ing force resolution. As demonstrated by HAK13, A'-body does 
not converge at fixed particle number. Here, we show the re¬ 
sults for standard A-body and for tri-quadratic phase space 
elements without adaptive refinement. Unlike HAK13, where 
no true convergence at fixed tracer particle number was seen, 
we see that the recursive mass deposit improves the situation 
substantially, leading to close to second order convergence to 
the analytical solution with a constant number of tracer par¬ 
ticles! For a given deposit recursion level, the error w.r.t. the 
analytical solution always asymptotes to a fixed value. To get 
true second order convergence, either exact deposits have to 
be employed (which have been implemented e.g. for tetrahedra 
by Powell fc Abel|2015 ), or an optimally chosen deposit level 
for a given force resolution has to be employed. 


4.2 Test 2: One-dimensional Plane Wave collapse 
(oblique) 

We next consider a single plane wave that is not aligned with the 
Cartesian axes, but has a wave vector kp = (2, 3, 5)/c/, where 
kf — 27r/L is the fundamental mode of the simulation box. 
This is a very hard problem for particle methods since the one¬ 
dimensional wave motion has to be preserved in three dimension 
(cf. [Melott et al.||1997 ). The particular choice of kp leads to 
the particles or flow tracers being folded onto themselves at 
rationals of the box-length. In Figure we show the phase 
space along the wave vector for the new phase space element 
method (using tri-linear and tri-quadratic elements without 
adaptive refinement) as well as the standard N-body method. 
We employed a force resolution of 256^ and 32^ tracer/N-body 
particles. The mass distribution of the phase space elements was 
sampled at two deposit levels. We compare the results to a high- 
resolution one-dimensional solution (using 16, 000 particles) 
at the same force resolution at two times corresponding to 
a = across (top panel) as well as a = ducross (bottom panel). 

The results are consistent with what HAK13 found (the 



xk/2n 


Figure 13. Collapse of a plane wave not aligned with the initial 
particle lattice. The phase space projected along the wave vector 
is shown at shell-crossing for standard A-body (black stars), the 
tri-linear scheme (red squares) and the tri-quadratic scheme (blue 
circles). The black line indicates a very high-resolution ID solution 
at identical force-resolution. In the top panel, the wave is shown 
at shell-crossing (a = Uc), in the bottom panel at a = 4ac. Note 
that points have been re-mapped to the unit interval of the wave 
mode so that points that appear as neighbours in phase-space are 
not neighbours but spread throughout the computational box. 


results in their Figure 11 are however shown at somewhat later 
times) in that the A-body solution differs widely from the true 
one-dimensional solution. Here we see that this discrepancy 
already exists at shell-crossing, but becomes most severe at 
later times, where substantial velocities outside of the plane of 
symmetry are found to break momentum conservation inside of 
the symmetry plane. Quite differently for the tri-linear and tri¬ 
quadratic phase space elements. Using the same number of 32^ 
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a. N-body 323 b. high-res N-body 5123 



Figure 14. Impact of the order of the Lagrangian elements on the solution without refinement: the “ripple wave” test problem using 32^ flow 
tracers and PM force resolution of 256^ cells for standard N — body using 32^ particles (panel a), using 8 x 32^ tetrahedral elements (panel c), 
32^ tri-linear elements (panel d) and 16^ tri-quadratic elements (panel e). These solutions are compared to a high-resolution A/'-body run 
employing 512^ particles (top right, panel b) at the same force resolution. The degrees of freedom of the system in panels a,c ,d and e are thus 
the same, while panel b has 16^ more. A rendering of panel a, using tri-quadratic elements a posteriori is shown in Fig. |l6| 


tracer particles as the A^-body solution, they have substantially 
smaller errors at shell crossing and trace the correct solution 
still very well at much later times when a = ducross, in fact, 
the tri-quadratic solution is right on top of the high-resolution 
solution. This is even more remarkably since the way we plot 
the solution makes it appear that the resolution is higher than it 
actually is: points that appear as neighbours in phase-space in 
this plot are not actually neighbours in Lagrangian space. They 
are points at the respective phase of the wave but at rather 
different locations on the Lagrangian manifold and therefore 
spread out throughout the entire simulation box. 


4.3 Test 3: two-dimensional perturbed plane wave 
collapse 


In a hnal test problem, we employ the same two-dimensional 


test as in Hahn et al. (2013): the anti-symmetrically perturbed 


wave described by Valinia et al. (1997). Compared to the 


plane wave collapse above, this problem introduces anisotropic 
collapse as well as mixing and is thus a signihcantly harder 
problem. In this test problem, the Lagrangian gravitational 
potential is given by that of a plane wave in x-dimension with 
a sinusoidal phase perturbation in the y-dimension 


(/)(x) = 0COS ( kp 


I 7 

X + CaTW COS kaV 

kl 


(27) 


The initial particle positions and velocities have been obtained 
using the Zel’dovich approximation at z = 19. We adopt kp = 
27^ka = 27t/L and Ca = 0.5, where L = 10/i“^Mpc is the 
size of the simulation box (note that this is different from the 
parameters chosen in HAK13), and 0 is chosen so that first shell 
crossing occurs at an expansion factor of ac = 1/7.7 ~ 0.13. 
We run this problem in an Einstein-de Sitter cosmology and 
discuss the results at a = 0.5. In all cases, unless otherwise 


specified, we use initial conditions for 32^ particles and employ 
a force resolution of 256^ PM cells. All solutions are run in 
three dimensions, effectively replicating the problem along the 
^-axis. The initial density fields for this test problem using the 
particle representation as well as the elements of various orders 
can be seen in FigureAs already discussed in Section [2. 1.3| 
the tetrahedral elements provide a piecewise constant, while 
the tri-quadratic provide a smoother continuous density field. 


In Figure we present the solution for this test problem 
using a constant number of elements of increasing order, from 
top to bottom: A^-body, then tetrahedra, trilinear elements, and 
tri-quadratic elements, and a reference high-resolution A/’-body 
run using 512^ particles at the same PM resolution. We see that 
all Lagrangian element methods recover a roughly identical 
structure that is only barely inferred visually from the low- 
resolution A^-body solution. Furthermore, we clearly see how 
the increasing order of the elements improves the smoothness 
of the resulting density field (as shown in Section 2.1.3). The 
most important results to be taken from this hgure are the 
central density of the clump and the position of the caustics. 
HAK13 have found that the self-gravitating tetrahedra can 
lead to a biased central density and also bias in the positions 
of caustics. We see these shortcomings reproduced in this 
hgure for both tetrahedra and tri-linear elements. The only 
advantage of the tri-linear elements is that the projected density 
is somewhat smoother. As expected, both the central density 
and the positions of caustics (compared to the high-resolution 
A^-body solution) are, at least visually, signihcantly improved 
when the tri-quadratic elements are used. 

Clearly, the density held inferred from the tri-quadratic 
elements is signihcantly smoother than that from the linear 
element types. One might thus wonder whether A/’-body actu¬ 
ally does well but looks bad in comparison. To illustrate that 
this is clearly not the case, in Fig. |16[ we show the density 
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Figure 15. The initial conditions for the “ripple-wave” test problem 
(cf. Sec. |4.2| >. Shown are the particle locations (panel a), the density 
field using the tetrahedral phase space elements (panel b), using 
tri-linear elements (panel c) and using tri-quadratic elements (panel 
d). The linear elements are discontinuous at element boundaries, 
while the quadratic is continuous. 



tri-quadratic reconstructed from N-body 323 


tri-quadratic 323 self-consistent 


Figure 16. Comparison between a reconstruction of the tri¬ 
quadratic density field from the 32^ standard A/^-body run (top 
half-panel) and the self-consistent evolution of the tri-quadratic 
elements (bottom half-panel). One clearly sees that A/’-body particle 
noise significantly perturbs the solution, in particular, caustics are 
not persistent. 


field using tri-quadratic elements, determined a posteriori from 
the evolved low-resolution A/’-body solution (top half-panel), in 
comparison to the self-consistent tri-quadratic element solution 
(bottom half-panel). Although the quadratic elements provide 
a smooth field, the A/’-body estimate show a very noisy solu¬ 
tion, where “noisy” now means that caustics are significantly 
perturbed compared to the much smoother self-consistent so¬ 
lution. The smooth field is thus clearly not an artefact of the 
rendering, but demonstrates convincingly the superiority of 
the Lagrangian elements over the A/’-body solution! 

Naturally, the big remaining question is whether dynamical 
adaptive refinement in such a self-gravitating system with 
mixing converges to the right solution. We show the solution 
using refinement in Figure Ell comparing once more against 
the 512^ particle high-res A/’-body solution at the same force 
resolution. We only consider the tri-quadratic elements in this 
case, although the linear elements also perform reasonably well. 


We started with the same 32^ initial conditions as in the fixed 
resolution test shown in Figure but now employed the force 
refinement criterion with a threshold of 0.1 to dynamically 
split elements if required (the results using velocity refinement 
are however not significantly different). The solution allowing 
for one additional level of refinement is shown in the top panel, 
the one for two levels in the middle panel, and the reference 
A^-body solution at the bottom. Rather strikingly, the solutions 
quickly converge to the reference solution in the exact shape 
and position of caustics. Already with one additional level, the 
central density of the clump is comparable to the reference 
solution. We do not perform a more quantitative solution of 
these toy problems but let the images speak for themselves 
and perform a quantitative convergence study of refinement 
in the next section, where we apply the Lagrangian element 
method to cosmological structure formation. 


5 A FIRST APPLICATION: COSMOLOGICAL 

SIMULATION OF A WARM DM UNIVERSE 

We now apply our Lagrangian phase space element method to a 
cosmological problem. We simulate the gravitational evolution 
of a L=20 Mpc/h cube in a universe where dark matter is 
made of warm particles of mass mdm = 250 eV, leading to a 
small-scale cut-off in the density perturbation spectrum. 

The cosmological parameters we employ correspond to 
those favoured by the WMAP7 data release measurements 
( Komatsu et al.||2011 ). Explicitly: Qm = 0.276, Qa = 0.724, 
Qb = 0.045, h = 0.703, ag = 0.811 and spectral index Us = 0.96. 


We generate our initial conditions using the MUSIC code (Hahn 
fc Abel||2011 ), with an input linear density power spectrum 


given by the fitting formulae of Eisenstein Sz Hu (1999) and 
a small-scale cutoff that mimicks the effects of a warm dark 
matter particle (see Bode et al.||2001 Angulo et al.|2013 for 
more details). 

We follow the evolution of 16^ Lagrangian patches, each 
one of mass 1.49 x 10^^ which are created from a 

set of 64^ particles. Gravitational forces are computed using 
a PM method with a grid of 512^ cells, which implies that 
forces are effectively softened below 80 kpc/h. We note that the 
dark matter particle mass, cosmological parameters, simulation 
particle mass and force resolution match those of | Angulo et al.| 
(20T3l. 


In the following, we compare the results of simulating the 
above configuration with 5 different methods; i) a standard 
A/’-body method, with 64^ particles, ii) the tetrahedral method 


of Hahn et al. (2013) and Angulo et al. (2013), with a total 
of 8 X 32'^ tetrahedra, iii) 32'^ tri-linear elements, iv) 16^ tri¬ 
quadratic elements, and v) 16^ tri-quadratic elements with 
1 or 3 levels of dynamic adaptive refinement allowed. They 
all employ numerically identical initial conditions and simply 
group particles together as vertices to define the elements. 

The simulations that feature our Lagrangian element 
method were run with 2 levels of recursion in their mass de¬ 
posit. This means that the density field associated to a given 
Lagrangian patch was represented with 42496 bodies, sum¬ 
ming up to 174063616 in the entire simulation volume. The 
simulation that was allowed to refine up to 3 levels created 
302737 new resolution elements by z = 1. At this redshift, we 
find only 18 patches that have not triggered refinement, 19794, 
76885, and 206040 at refinement levels 1,2, and 3, respectively. 
We recall that at the highest refinement levels, particle have a 
mass 8^ times smaller, i.e rup = 2.8 x 10® h~^MQ. 
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Figure 17. The ripple wave collapse test with dynamic adaptive refinement. The 32® runs use the same initial conditions as in Fig. |l4| 
tri-quadratic elements and one (left, panel a), and two (middle, panel b) of dynamic adaptive refinement. The right panel shows again the 
solution of a high-resolution A/’-body run using 512^ particles at the same 256^ PM force resolution. One clearly sees how adding more 
supporting points approaches the high-resolution A^-body solution. Still, the left two panels have significantly fewer degrees of freedom than 
the A/'-body run. 
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Figure 18. A Warm DM universe at 2 ; = 1 simulated employing the adaptively refined phase-space element method presented in this paper. 
We use here the tri-quadratic elements. The left panel shows the logarithm of the projected mass density field. The right panel shows a map of 
the final refinement level reached by Lagrangian patches; in black regions that did not trigger our refinement threshold, and in white those 
that reached the maximum refinement level allowed in this simulation. 


The evolution was simulated from z = 63 to z = 1 using 
192 global time-steps. As could have been expected, the compu¬ 
tational costs of the different runs differ largely. The standard 
A/’-body run took 81 CPU mins. The tetrahedra, linear and 
quadratic runs took 144, 305, 302 CPU mins, respectively. The 
run with refinement took257 CPU hours. Out of this time, 70% 
was spent in the mapping of mass carriers onto the PM grid, 
and 20% in the on-the-fly generation of mass carriers. 

In Fig. we illustrate the performance of our adaptively- 
refined Lagrangian method. In the left panel, we show the final 
density held in Eulerian coordinates at 2 : = 1. The high level 
of detail that our method provides about the topology of the 
large scale structure in WDM is clearly obvious. One should 
keep in mind that this is while using the equivalent of evolving 


only 64^ points. We highlight that the image displayed is not a 
smoothed rendering of our patches, but it is the actual density 
held used inside our simulation code to gravitationally evolve 
the dark matter huid. On the right hand side image, we show a 
map of the highest rehnement level reached by different regions 
in Lagrangian coordinates (shown as the maximum along the 
line-of-sight). It is clear that the rehnement level correlates 
with the Lagrangian counterpart of dense hlaments and haloes, 
and that the material infalling into the largest haloes will lead 
to the largest amounts of rehnement. 


After this hrst impression of the full performance of our 
approach, we now qualitatively compare our different runs. In 
Fig 19 we show the density held at z = 1 for a 5 x 5 Mpc/h 
region centred on the most massive haloes present in the box. 
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N-body 3 tetrahedra 



tri-quadratic + 1 level ref. tri-quadratic -r 3 level ref. 



Figure 19. The projected mass field in a region of dimensions 
20 X 5 Mpc/h of our WDM simulations at z = 1. Each panel displays 
the field predicted by different methods, as indicated in the top 
left corner. The image corresponds to the logarithm of the actual 
density field used to computed gravitational forces trough the Poisson 
equation. Note that the colour scale is identical in all panels. The top 
four panels have the same number of degrees of freedom (particles), 
while in the bottom two panels, the degrees of freedom increase over 
time to capture mixing in the haloes. 


We present the same image for all the cases mentioned above, as 
indicated by the legend. As in the previous image, this is a not 
a “visualization-purpose” rendering of the particle distribution, 
but the actual density field that sources the gravitational 
potential field trough the Poisson equation in our gravitational 
evolution. 

It is readily visible that all methods coincide in their pre¬ 
dicted large scale structure: filaments, halos and voids coincide 
in their location and extent. However, there are also clear differ¬ 
ences among the methods. Perhaps the most striking difference 
is the evident granularity of the standard A’-body method and 
how filaments are broken into clumps in this case. In contrast 
to this, all of our Lagrangian methods provide an extremely 
smooth field where filaments are continuous ID structures. 
Furthermore, some of these filaments are embedded inside 
larger filaments which shows very clear caustics, none of these 
features can be even distinguished in the standard A-body 
case. Again, we want to emphasise that the top four panels in 
this figure employ the exact same number of “particles”. 




Figure 20. Matter power spectra measured in our WDM simulation 
at three different redshifts, z = 4, 2.3, and 1 from bottom to top. At 
each redshift, the six lines display the results using different methods 
to simulate the evolution of the dark matter density field: standard 
PM A-body (solid black), tetrahedral (dotted red), tri-linear (dashed 
blue), and tri-quadratic elements (dash-dotted green), as well as 
tri-quadratic with one (dash-dot-dot, pink) and three dynamic levels 
of refinement allowed (long dashed grey), as indicated by the legend. 
The Poisson shot noise at z = 1 is indicated by the horizontal dashed 
line. In the bottom panel we display the ratio of the measured spectra 
to the spectra computed by the standard A-body method at 2 ; = 1. 
For the four non-refined methods, the degrees of freedom (particles) 
are identical and and constant and they only differ in the order of 
the mass projection. 

When comparing the runs using the new method, we can 
see that they most strongly differ in the internal structure of 
the massive haloes present in the image. The cases employing 
tetrahedra, tri-linear and even tri-quadratic elements all display 
very concentrated and round halos (with the tri-quadratic 
elements improving slightly). This is a consequence of the biases 
extensively discussed throughout this paper. The situation 
is clearly different when adaptive refinement is enabled and 
the halos appear less dense and more triaxial, very much in 
agreement with the standard A-body case. 

In Fig. we quantify these differences by comparing the 
density power spectrum among our runs. This figure shows 
three groups of curves, from bottom to top: z = 4, 2.3 and 1, 
respectively. Within each set, coloured lines show the results of 
the 6 cases discussed above and indicated by the figure legend. 
At high redshift, we see that all methods are almost indistin¬ 
guishable on large scales. On small scales, the standard A-body 
case strongly deviates from the rest due to the dominating ef¬ 
fect of the primordial lattice, which drives the measured power 
spectrum to the Poisson shot noise level on scales smaller than 
the mean inter-particle separation. At z = 2.3 the differences 
between methods start to appear at /c > 5 h/Mpc. However, 
the contribution of shot-noise in the A-body case is not negligi- 


© 0000 RAS, MNRAS 000 , 000-000 













18 0. Hahn & R. E. Angulo 


ble on such scales. One can however already see that the power 
spectrum of the Lagrangian element methods without refine¬ 
ment is biased high, while those with refinement are consistent 
with the iV-body solution. 

At z — 1 we can then clearly compare the performance 
of the different methods. In agreement with the qualitative 
picture provided by Fig.[^ tetrahedral and tri-linear elements 
lead to a factor of ~ 4 more power than the iV-body case dXk — 
10/i~^Mpc. Tri-quadratic elements and one level of adaptive 
refinement significantly reduce the excess power, are, however, 
still not sufficient to resolve the phase space sheet evolution well 
enough. Clearly, allowing for more levels of dynamic refinement 
solves this problem. When three levels are allowed, then for 
k < 10 Mpc/h the measured spectrum perfectly agrees with 
standard Wbody case. An accurate comparison is not possible 
on smaller scales since the N-body case becomes dominated 
again by the Poisson noise intrinsic to its discrete nature. The 
adaptively refined tri-quadratic elements however probe the 
density field to much smaller scales with no such noise term, 
as we had seen in Figure qualitatively. 


6 DISCUSSION AND CONCLUSIONS 


Numerical simulations of the evolution of the dark matter fluid 
have been and will be the backbone of physical cosmology in 
the late Universe. They are essential to our understanding of 
the formation of structures in the Universe - voids, filaments 
and clusters - as well as of the evolution and formation of 
galaxies and the distribution of dark matter around our Milky 
Way. In addition, they are indispensable in the correct and 
accurate interpretation of virtually all cosmological observables, 
and thus, they are fundamental to almost every aspect of dark 
energy probes and dark matter experiments. 

Methodologically, however, all of these predictions rely on 
only few techniques in the mildly non-linear regime: namely 
perturbative approaches and effective field theories; and only a 
single technique in the highly-nonlinear regime in three dimen¬ 
sions: namely the iV-body method. While it may appear to the 
outsider that there is a wide range of numerical tools available, 
PM, TreePM, fast-multipole A-body, direct N-body, AMR, 
these approaches only differ in how they calculate the forces 
between the particles, but not how they discretise the dark 
matter fluid (see e.g. [Dehnen fc Read|2011 for a recent review 
and discussion of the A-body method for both collisional and 
collisionless dynamics). In all these methods the collisionless 
Vlasov-Poisson system with cold initial conditions is discretised 
in terms of the characteristics of N massive particles. It can 
be shown that as A ^ cx), the A-body system approaches 
the Vlasov-Poisson system, but naturally these simulations are 
far from that limit. As we have argued in length in the Intro¬ 
duction, there are several known short-comings of the V-body 
approach, most notably spurious fragmentation in simulations 
with a cut-off scale in the initial perturbation spectrum, as 
well as two body scattering. Recently, Hahn et al. ( 2013| have 
presented a promising alternative to V-body that was shown 
to avoid spurious fragmentation by performing a tessellation of 
the V-body particles in Lagrangian space and then assigning 
the mass to the tetrahedral elements (whose connectivity is 
preserved throughout the simulation) instead of the vertices 
as in the V-body approach. At the same time, the method 


of Hahn et al. (2013) is limited by the strong constraint that 


the tetrahedra should describe the mass enclosed between the 
particles accurately. This however is a constraint that is known 


to be violated in regions of strong mixing, manifesting itself as 
a density bias in the centres of halos. 

In this paper we present a more general discretisation of 
the phase space sheet in terms of “Lagrangian phase space 
elements” which are a generalisation of both V-body and 
the tetrahedral tessellation approach. We can summarise our 
results as follows: 


(1) We propose a new “Lagrangian phase space element” 
method in Section that describes a piecewise mapping 
between Lagrangian space and Eulerian phase space. We 
specifically discuss tri-polynomial maps in this paper, but 
other functional forms are possible. The Jacobian of the 
map to configuration space defines a unique density for 
every point in the element, and the projection of all Jaco- 
bians provides a density field in Eulerian space of arbitrary 
regularity, determined only by the order of the chosen 
polynomials. Specifically we show that the tetrahedral ap¬ 
proximation of Hahn et al. (2013) is a low order version 
implementing only a piecewise linear transformation be¬ 
tween the two spaces. We consider explicitly tri-linear and 
tri-quadratic elements here, the latter giving a continuous 
density field and the latter two being able to represent 
caustics on a single-element level. 

(2) We then discuss a simple pseudo-particle discretisation 
of the elements. This simplifies the problem of projecting 
high-order elements that may contain caustics onto a grid 
in order to solve for the gravitational field. The pseudo¬ 
particle approach also facilitates the interfacing of our 
method with existing V-body codes. 

(3) The phase-space elements map from cuboid domains in 
Lagrangian space to Eulerian space. We show that by sub¬ 
dividing these domains, one can construct a Lagrangian 
oct-tree based AMR scheme that allows for dynamic adap¬ 
tive refinement of the elements. 

(4) In Sections 3.1 and 3.2 we demonstrate that in a simple 
phase-mixing test in a fixed potential dynamic adaptive re¬ 
finement allows us to conserve energy and represent phase¬ 
mixing exactly for the fine-grained distribution function by 
inserting new vertices. At the same time, new quantities, 
not accessible for an iV-body simulation can be calculated. 
We show, e.g., the growth of the total configuration space 
volume of the distribution function over time. 

(5) Next, in Section [ tT] we show that we approach second 
order convergence with the quadratic elements at fixed 
vertex number and increasing force resolution in the plane 
wave collapse test. In the following sections, we provide 
further one- and two-dimensional tests that demonstrate 
the high accuracy of our approach and the excellent perfor¬ 
mance of the Lagrangian phase-space elements compared 
to A-body. 

(6) Finally, in Section]^ we argue that our method is mature 
and efficient enough for its application to cosmological 
structure formation problems. As an example, we simulate 
the evolution of the dark matter fiuid with a small-scale 
cut-off in its initial power spectrum. We explicitly show 
that our method solves the artificial fragmentation problem 
while providing an unbiased solution for the density field 
in regions of heavy distortions of Lagrangian elements. 


Clearly, this new approach will enable us to follow the full fine¬ 
grained distribution function throughout gravitational collapse 
in the formation history of a halo. We expect that knowledge of 
the distribution of dark matter in six-dimensional phase-space 
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will provide new insights into virialisation as well as relaxation 
processes in such collisionless self-gravitating systems. 
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